home *** CD-ROM | disk | FTP | other *** search
- From: chughes@maths.tcd.ie (Conrad Hughes)
-
- >WANTED: 32/64 bit square root algorithms in Assembly.
- Here's the best code I've managed yet; it manages to fit into four
- instructions per bit. If anyone has managed better, please let me
- know.. If you do use the code, it'd be nice to receive a mention
- :-)
- DEFFNsqrt(N,R,T,S):LOCALA%:IFS<>N [OPTpass:MOV S,N:]
- [OPTpass:MOV R,#0:]:FORA%=7TO0STEP-1:[OPTpass:ADD T,R,#1<<A%+15
- CMP S,T,LSL#A%+1:SUBGE S,S,T,LSL#A%+1:ADDGE R,R,#1<<A%+16:]NEXT:[OPTpass
- ADD T,R,#1<<14:CMP S,T:SUBGE S,S,T:ADDGE R,R,#1<<15:]FORA%=2TO15:[OPTpass
- ADD T,R,#1<<15-A%:RSBS S,T,S,LSL#1:ADDLT S,S,T:ADDGE R,R,#1<<16-A%:]NEXT
- [OPTpass:ADD T,R,R:ADD T,T,#1:RSBS S,T,S,LSL#2:ADDLT S,S,T:ADDGE R,R,#1
- CMP S,R:ADDGE R,R,#1:]=0
- ..defines a function which calculates the square root of a number
- stored in one register, with the fixed point between bits 15 and 16;
- I didn't write down behaviour for negative numbers, but you should
- be able to test this out for yourself, and it shouldn't be too
- difficult to extend to different formats. To convert a number into
- this format, you just multiply it by 65536, and to convert back, you
- divide by 65536 (i.e., 1.000 would be represented as 65536 in a
- register).
-
- So, for example, you want R1=SQRT(R0):
- FNsqrt(0,1,2,3) will preserve the contents of R0, and use R2 and R3
- for data - both will be corrupted.
- FNsqrt(0,1,2,0) will corrupt R0, but doesn't need R3.
- There are no other circumstances under which two of the registers
- the routine uses can safely be the same.
-
- The loops are unwrapped for speed (actually using loops would take
- up much less space, but would probably use up another register, and
- make the routine twice as slow). The routine takes up 400 or 404
- bytes, depending on whether N=S or not.
-
- One thing - this is slow :-}, so wherever possible you should avoid
- using square roots.. Similarly division.
-
-
-